Propagating metadata associated with digital video

ABSTRACT

Some embodiments provide a method for processing metadata associated with digital video in a multi-state video processing system. The method specifies a set of rules for propagating the metadata between different states in the video processing system. It then propagates the metadata between the states based on the specified set of rules.

FIELD OF THE INVENTION

The present invention is directed towards propagating metadata associated with digital video.

BACKGROUND OF THE INVENTION

When first designed, the digital video format was truly revolutionary, not only in its treatment of the storage of video/audio media, but also in the transport used to move media between devices. With a digital tape format, timecode was no longer relegated to special purpose control tracks, but carried along with the relevant video and audio data in a cohesive frame based unit. A pure digital connection (e.g., a connection over SDI or FireWire) allows for data to be transferred between devices with no information loss. Beyond simple timecode, extra space is set aside in each frame to carry other types of useful embedded “metadata”, including such information as camera configuration/exposure settings, time of day, scene breaks, etc.

In many modern non-linear editors (NLEs), most of this auxiliary information is ignored. Moreover, there does not exist a coherent approach for propagating metadata across different stages of the editing pipeline. Therefore, there is a need in the art for an innovative method for propagating metadata across different stages of a video processing system. Ideally, this method should allow an editor to provide different rules for propagating different types of metadata.

SUMMARY OF THE INVENTION

Some embodiments provide a method for processing metadata associated with digital video in a multi-state video processing system. The method specifies a set of rules for propagating the metadata between different states in the video processing system. It then propagates the metadata between the states based on the specified set of rules.

For instance, in some embodiments, the method exports digital video to a storage. This method initially determines whether a set of metadata of the digital video has an associated set of rules for exporting the set of metadata. If so, it then exports the set of metadata to the storage based on the set of rules.

In some embodiments, the method recaptures digital video from a first storage, when at least a portion of the digital video is also stored in a second storage. The method retrieves the digital video from the first storage. It identifies a set of metadata that is stored for the digital video in the second storage, and then determines whether there is an associated set of rules for processing this set of metadata when the digital video is re-captured from the first storage. If so, the method then stores the set of metadata with the retrieved digital video in a third storage.

In some embodiments, the method displays metadata associated with digital video. The method retrieves a set of metadata associated with the digital video. It then determines whether the set of metadata has an associated set of rules for displaying the set of metadata. If the set of metadata has such a set of associated rules, the method displays the set of metadata according to the set of associated rules.

BRIEF DESCRIPTION OF THE DRAWINGS

The novel features of the invention are set forth in the appended claims. However, for purpose of explanation, several embodiments of the invention are set forth in the following figures.

FIG. 1 illustrates the conceptual attributes of a metadata set in some embodiments.

FIG. 2 illustrates one manner for defining a metadata set with an implicit duration in an OS X® environment of Apple Computers, Inc.

FIG. 3 illustrates one manner for defining a metadata set with an explicit duration in an OS X® environment of Apple Computers, Inc.

FIG. 4 conceptually illustrates a rule-specifying structure.

FIG. 5 illustrates a process that some embodiments use to export a digital video presentation from an application database of a video editing application to a file that is stored on a storage.

FIG. 6 illustrates a process that some embodiments use during the recapture of the audio or video components of a digital video presentation.

FIG. 7 illustrates a process that some embodiments use to display the metadata associated with a digital video presentation.

FIG. 8 conceptually illustrates a computer system that can be used to implement some embodiments of the invention.

DETAILED DESCRIPTION OF THE INVENTION

In the following description, numerous details are set forth for purpose of explanation. However, one of ordinary skill in the art will realize that the invention may be practiced without the use of these specific details. In other instances, well-known structures and devices are shown in block diagram form in order not to obscure the description of the invention with unnecessary detail.

Some embodiments provide a method that processes metadata associated with video in a multi-state video processing system. This method specifies a set of rules for propagating the metadata between different states in the video processing system. It then propagates the metadata between the states based on the specified set of rules.

For instance, in some embodiments, the method exports digital video to a storage. This method initially determines whether a set of metadata of the digital video has an associated set of rules for exporting the set of metadata. If so, it then exports the set of metadata to the storage based on the set of rules.

In some embodiments, the method recaptures digital video from a first storage, when at least a portion of the digital video is also stored in a second storage. The method retrieves the digital video from the first storage. It identifies a set of metadata that is stored for the digital video in the second storage, and then determines whether there is an associated set of rules for processing this set of metadata when the digital video is re-captured from the first storage. If so, the method then stores the set of metadata with the retrieved digital video in a third storage.

In some embodiments, the method displays metadata associated with digital video. The method retrieves a set of metadata associated with the digital video. It then determines whether the set of metadata has an associated set of rules for displaying the set of metadata. If the set of metadata has such a set of associated rules, the method displays the set of metadata according to the set of associated rules.

The set of associated rules for a set of metadata can include only one rule, or it can include several rules. A metadata set's associated rule set can also include a rule specifying whether all or part of the metadata set should be exported, recaptured, or displayed. The default operation of some embodiments is to propagate a metadata set from one state to another when no set of associated rules exists for propagating the metadata set.

I. Metadata

Digital video metadata can include a variety of ancillary data, such as time of day, timecode, camera settings, encoding cadence, film maker's comments, etc. The metadata is specified and stored differently in different embodiments. FIG. 1 illustrates the conceptual attributes of each metadata set in some embodiments. As shown in this figure, each metadata set has a scope 105 that specifies the metadata set's associated portion of digital video. This portion can be an entire file, one or more tracks in the file, or a portion of one or more tracks in the file. The scope can be specified explicitly or implicitly for a metadata set.

As shown in FIG. 1, the metadata set includes metadata content 110, which is the actual metadata in the set. Also, the metadata set includes a temporal duration 115. Different embodiments define the temporal duration differently. For instance, this duration is implicit in some embodiments, while it is explicit in other embodiments.
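By way of illustration only, the three conceptual attributes of FIG. 1 can be sketched as a small data structure. The class names, field names, and the encoding of the scope as track identifiers plus an optional time range are assumptions made for this sketch; the description above does not prescribe any particular representation.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional, Tuple


@dataclass
class Scope:
    """Portion of the digital video that a metadata set applies to (scope 105)."""
    track_ids: Tuple[int, ...] = ()                    # empty tuple: the entire file
    time_range: Optional[Tuple[float, float]] = None   # None: the whole track(s)

    def describe(self) -> str:
        if not self.track_ids:
            return "entire file"
        tracks = ", ".join(str(t) for t in self.track_ids)
        if self.time_range is None:
            return f"tracks {tracks}"
        start, end = self.time_range
        return f"tracks {tracks} from {start}s to {end}s"


@dataclass
class MetadataSet:
    scope: Scope                                              # scope 105
    content: Dict[str, Any] = field(default_factory=dict)     # metadata content 110
    duration: Optional[float] = None                          # duration 115; None = implicit
```

In this sketch, a duration of None stands for an implicit duration, and a number stands for an explicit duration in seconds.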

FIG. 2 conceptually illustrates one manner for defining a metadata set with an implicit duration in an OS X® environment of Apple Computers, Inc. Specifically, this figure illustrates a user data element 200 that has (1) a 4-character index 205, which is associated with a particular audio/video item (e.g., a/v track or movie), and (2) a metadata container 210 that stores the metadata. The duration of such a data element is implicitly assumed to match the duration of the a/v item associated with the data element (i.e., the a/v item associated with the character index 205).

FIG. 3 conceptually illustrates one manner for defining a metadata set with an explicit duration in an OS X® environment of Apple Computers, Inc. Specifically, this figure illustrates that some embodiments define a metadata set by defining a metadata track, giving the track specific start and end times, and associating the metadata track with one or more a/v tracks. In FIG. 3, the metadata track is track 305, which has start and end times 315 and 320, and which references a video track 310. In such embodiments, the actual metadata content is stored in the metadata track 305, and the duration of the metadata is derived from the specified start and end times and the sample times.
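The difference between the implicit duration of FIG. 2 and the explicit duration of FIG. 3 can be sketched as follows. This is a minimal illustration and not the OS X® data layout: the class names, the lookup table keyed by the 4-character index, and the use of seconds are all assumptions, and the sketch derives the explicit duration from the start and end times alone, ignoring sample times.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class AVItem:
    name: str
    duration: float  # seconds


@dataclass
class UserDataElement:
    """Implicit duration (FIG. 2): index 205 plus metadata container 210."""
    index: str                  # 4-character index naming an a/v item
    container: Dict[str, Any]   # the stored metadata

    def __post_init__(self) -> None:
        if len(self.index) != 4:
            raise ValueError("index must be exactly 4 characters")

    def duration(self, items_by_index: Dict[str, AVItem]) -> float:
        # The duration is inherited from the a/v item that the index refers to.
        return items_by_index[self.index].duration


@dataclass
class MetadataTrack:
    """Explicit duration (FIG. 3): start/end times and referenced a/v tracks."""
    start: float
    end: float
    references: List[str] = field(default_factory=list)
    content: Dict[str, Any] = field(default_factory=dict)

    def duration(self) -> float:
        # Simplification: sample times are not modeled in this sketch.
        return self.end - self.start
```

For example, a user data element indexed to a 30-second video track reports 30 seconds, while a metadata track with start time 2.0 and end time 7.5 reports 5.5 seconds regardless of the length of the track it references.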

Some embodiments organize a metadata set (e.g., organize the metadata content that is stored in the metadata track 305 of FIG. 3 or the metadata container 210 of FIG. 2) in a rule-specifying structure that specifies one or more propagation rules for one or more metadata items in the metadata set. FIG. 4 conceptually illustrates one such rule-specifying structure. As shown in this figure, the rule-specifying structure 400 can specify several propagation rules for several propagation events for a metadata set. In some embodiments, the rule-specifying structure is an XML document that uses XML keys, codes and tags to specify different metadata items and rules for propagating these items.
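As a hedged illustration of an XML rule-specifying structure, the sketch below parses a small document into metadata items and per-event rules. The element names (metadata-set, item, rule, include), the attribute names, and the action values "all", "none", and "subset" are invented for this example; the description above states only that XML keys, codes, and tags are used.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule-specifying document; the schema is an assumption.
RULES_XML = """
<metadata-set name="director-comments">
  <item key="scene-1">Opening shot</item>
  <item key="scene-2">Needs color correction</item>
  <rule event="export" action="none"/>
  <rule event="recapture" action="all"/>
  <rule event="display" action="subset">
    <include key="scene-1"/>
  </rule>
</metadata-set>
"""


def parse_rules(xml_text):
    """Return (items, rules), where rules maps an event to (action, keys)."""
    root = ET.fromstring(xml_text)
    items = {item.get("key"): item.text for item in root.findall("item")}
    rules = {}
    for rule in root.findall("rule"):
        included = [inc.get("key") for inc in rule.findall("include")]
        rules[rule.get("event")] = (rule.get("action"), included)
    return items, rules


items, rules = parse_rules(RULES_XML)
print(rules["display"])  # ('subset', ['scene-1'])
```

A propagation event with no entry in the parsed rules would fall back to the default behavior of propagating the whole metadata set.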

II. Exporting Metadata

FIG. 5 illustrates a process 500 that some embodiments use to export a digital video presentation from an application database of a video editing application to a file (“destination file”) that is stored on a storage (e.g., a hard drive). This presentation has a video track that is a composite of one or more video tracks and/or special effects. It might also include an audio track that is a composite of one or more audio tracks and/or special effects. It also might include one or more sets of metadata. Each metadata set might include one or more metadata items (i.e., one or more subsets of metadata content). For instance, one metadata set might include the time. This metadata set's individual metadata items might be specified as the time of day, day, month, and year. Another metadata set might be the director's comments, and this set's individual metadata items might be the director's comments for each individual scene in the presentation.

As shown in FIG. 5, the process initially stores (at 505) the video track of the presentation in the destination file. It then stores (at 510) the audio track of the presentation in the destination file. The process next determines (at 515) whether there is any metadata set associated with the a/v presentation. If not, the process ends.

Otherwise, the process selects (at 520) a metadata set. It then determines (at 525) whether the metadata set that it just selected at 520 includes any export rules regarding exporting the selected metadata set. If not, the process writes (at 530) the selected metadata set to the destination file, and then transitions to 545, which will be described below. Otherwise, the process determines (at 535) whether the export rule of the selected set prohibits exporting all metadata items in the selected metadata set. If so, the process transitions to 545, which will be described below.

On the other hand, if the export rule specifies that all or part of the metadata set should be exported, the process transitions to 540. At 540, the process writes the selected metadata set to the destination file according to the export rule identified at 525. In other words, at 540, the process writes to file the portion of the selected metadata set that this set's export rule designates for exporting. In some embodiments, the export rule for a metadata set can only specify whether the entire metadata set should be exported or not. Other embodiments, however, allow a metadata set's export rule to specify smaller portions of a metadata set for exporting. After 540, the process transitions to 545.

At 545, the process determines whether it has examined all metadata sets of the a/v presentation. If not, the process returns to 520 to select a metadata set of the presentation that it has not yet examined, and then repeats the above-described operations 525-545 for this selected set. When the process determines that it has examined all the metadata sets of the a/v presentation, it ends.
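The export operations 505-545 described above can be sketched as a single loop. This is an illustrative sketch under stated assumptions, not the implementation of any embodiment: the destination file is modeled as a dictionary, each metadata set is a dictionary with an "items" entry and an optional "rules" entry, and the rule fields "action" and "keys" are hypothetical names.

```python
from typing import Any, Dict


def export_presentation(video: bytes, audio: bytes,
                        metadata_sets: Dict[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Build the destination file as a dict (a stand-in for a real file)."""
    # 505, 510: store the video and audio tracks.
    destination: Dict[str, Any] = {"video": video, "audio": audio, "metadata": {}}

    # 515/520/545: visit each metadata set once; an empty mapping ends at once.
    for name, mset in metadata_sets.items():
        items = mset["items"]
        rule = mset.get("rules", {}).get("export")       # 525: any export rule?
        if rule is None:
            destination["metadata"][name] = dict(items)  # 530: default is to export
        elif rule["action"] == "none":
            continue                                     # 535: export prohibited
        elif rule["action"] == "all":
            destination["metadata"][name] = dict(items)  # 540: export everything
        else:
            # 540: export only the items that the rule designates.
            keep = rule.get("keys", [])
            destination["metadata"][name] = {k: items[k] for k in keep if k in items}
    return destination
```

A metadata set without an export rule is written in full, which mirrors the default operation of propagating a metadata set when no associated rule exists.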

III. Regenerating Metadata

FIG. 6 illustrates a process 600 that some embodiments use during the recapture of the audio or video components of a digital video presentation. This presentation typically includes (1) a video track that is a composite of one or more video tracks and/or special effects, and (2) an audio track that is a composite of one or more audio tracks and/or special effects. It also might include one or more sets of metadata, each of which might include one or more metadata items.

In some embodiments, before the recapture operation, a video editing application's data storage includes some or all of the audio or video components plus metadata associated with these components. In some of these embodiments, the recapture operation imports from a first data storage (e.g., a tape or a file stored on disk) audio or video components of a digital video presentation into the application data storage of the video editing application. This recapture operation might be performed when the original a/v components in the application data storage are corrupted or are incomplete.

As shown in FIG. 6, the process initially identifies (at 605) the time period in the a/v presentation for which it needs to recapture a/v data. From the first storage, it then (at 610) recaptures the a/v data within the period identified at 605 and stores the captured data in a media file (e.g., a file on disk). The process next determines (at 615) whether there is any metadata set in the application data storage (e.g., application database) that is associated with the portion of the a/v presentation that the process 600 is recapturing. If not, the process imports (at 650) the media file into the application database, and then ends. Some embodiments import a media file into the application data storage by simply linking the media file to the application data storage.

Otherwise, from the application data storage, the process selects (at 620) a metadata set that is associated with the portion of the a/v presentation that the process 600 is recapturing. It then determines (at 625) whether the metadata set that it just selected at 620 includes any recapture rules regarding processing the selected metadata set during a recapture operation. If not, the process writes (at 630) the selected metadata set to the media file, and then transitions to 645, which will be described below. Otherwise, the process determines (at 635) whether the recapture rule of the selected set prohibits the recapturing of all data in the selected metadata set. If the recapture rule prohibits the recapture of all data in the selected metadata set, the process transitions to 645, which will be described below.

On the other hand, if the recapture rule specifies that all or part of the metadata set should be recaptured, the process transitions to 640. At 640, the process writes the selected metadata set to the media file (that contains the a/v components that were recaptured at 610) according to the recapture rule identified at 625. In other words, at 640, the process writes to the media file the portion of the selected metadata set that this set's recapture rule designates for recapturing. In some embodiments, the recapture rule for a metadata set can only specify whether the entire metadata set should be recaptured or not. Other embodiments, however, allow a metadata set's recapture rule to specify smaller portions of a metadata set for recapturing. After 640, the process transitions to 645.

At 645, the process determines whether it has examined all metadata sets associated with the portion of the a/v presentation that the process 600 is recapturing. If not, the process returns to 620 to select another associated metadata set that it has not yet examined, and then repeats the above-described operations 625-645 for this selected set. The process transitions to 650 when the process determines (at 645) that it has examined all the metadata sets that are associated with the portion of the a/v presentation that the process 600 is recapturing. At 650, the process imports the media file into the application database, as mentioned above. Some embodiments import a media file into the application data storage by simply linking the media file to the application data storage. After 650, the process ends.
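The recapture operations 605-650 differ from the other propagation events in two respects: only metadata sets associated with the recaptured period are considered, and the resulting media file is linked into the application data storage at the end. The sketch below illustrates both points under stated assumptions. The first storage is modeled as a list of frames, periods are half-open frame ranges, and the dictionary keys ("period", "items", "rules", "action", "keys", "media_files") are hypothetical.

```python
from typing import Any, Dict, List, Tuple


def overlaps(a: Tuple[int, int], b: Tuple[int, int]) -> bool:
    """True when the half-open periods [start, end) share any time."""
    return a[0] < b[1] and b[0] < a[1]


def recapture(source: List[str], period: Tuple[int, int],
              app_store: Dict[str, Any]) -> Dict[str, Any]:
    start, end = period                                            # 605
    media_file = {"frames": source[start:end], "metadata": {}}     # 610
    for name, mset in app_store["metadata_sets"].items():          # 615/620/645
        if not overlaps(mset["period"], period):
            continue  # not associated with the recaptured portion
        rule = mset.get("rules", {}).get("recapture")              # 625
        items = mset["items"]
        if rule is None or rule["action"] == "all":
            media_file["metadata"][name] = dict(items)             # 630 / 640
        elif rule["action"] == "subset":
            media_file["metadata"][name] = {                       # 640
                k: items[k] for k in rule.get("keys", []) if k in items
            }
        # action "none": the rule prohibits recapture (635), nothing is written
    app_store["media_files"].append(media_file)                    # 650: link
    return media_file
```

Appending the media file to the store's list stands in for the linking variant of the import at 650; an embodiment that copies the file into the application database would replace that single step.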

IV. Displaying Metadata

FIG. 7 illustrates a process 700 that some embodiments use to display the metadata associated with a digital video presentation. As with exporting and recapturing, the a/v presentation typically includes (1) a video track that is a composite of one or more video tracks and/or special effects, and (2) an audio track that is a composite of one or more audio tracks and/or special effects. It also might include one or more sets of metadata, each of which might include one or more metadata items. The metadata is stored initially in an application data storage along with the a/v presentation.

From the application data storage, the process initially loads (at 705) the a/v presentation for display. In some cases, this loading operation involves a rendering of the audio and video components of the presentation. The process next determines (at 710) whether there is any metadata set associated with the a/v presentation. If not, the process generates (at 715) an empty list for display, and then displays (at 750) this list. This empty list specifies that there is no metadata associated with the a/v presentation for display. After 750, the process ends.

If the process determines (at 710) that there is metadata associated with the a/v presentation, the process selects (at 720) a metadata set. It then determines (at 725) whether the metadata set that it just selected at 720 includes any display rules regarding displaying the selected metadata set. If not, the process writes (at 730) the selected metadata set to the display list, and then transitions to 745, which will be described below. Otherwise, the process determines (at 735) whether the display rule of the selected set prohibits the displaying of all the data in the selected metadata set. If the display rule prohibits the display of all the data in the selected metadata set, the process transitions to 745, which will be described below.

On the other hand, if the display rule specifies that all or part of the metadata set should be displayed, the process transitions to 740. At 740, the process writes the selected metadata set to the display list according to the display rule identified at 725. In other words, at 740, the process writes to the display list the portion of the selected metadata set that this set's display rule designates for displaying. In some embodiments, the display rule for a metadata set can only specify whether the entire metadata set should be displayed or not. Other embodiments, however, allow a metadata set's display rule to specify one or more subsets of a metadata set for display. After 740, the process transitions to 745.

At 745, the process determines whether it has examined all metadata sets of the a/v presentation. If not, the process returns to 720 to select a metadata set of the presentation that it has not yet examined, and then repeats the above-described operations 725-745 for this selected set. When the process determines (at 745) that it has examined all the metadata sets of the digital video presentation, it presents (at 750) the display list, and then ends.
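The construction of the display list at operations 710-745 can be sketched as follows. This is a minimal illustration that assumes the same hypothetical dictionary layout for metadata sets and rules ("items", "rules", "action", "keys") and formats each displayed item as a line of text; actual embodiments may present the list in any form.

```python
from typing import Any, Dict, List


def build_display_list(metadata_sets: Dict[str, Dict[str, Any]]) -> List[str]:
    display_list: List[str] = []   # 715: stays empty when there is no metadata
    for name, mset in metadata_sets.items():            # 710/720/745
        items = mset["items"]
        rule = mset.get("rules", {}).get("display")     # 725: any display rule?
        if rule is None or rule["action"] == "all":
            shown = items                               # 730 / 740: show everything
        elif rule["action"] == "none":
            continue                                    # 735: display prohibited
        else:
            # 740: show only the subset that the rule designates.
            shown = {k: items[k] for k in rule.get("keys", []) if k in items}
        display_list.extend(f"{name}.{key}: {value}" for key, value in shown.items())
    return display_list
```

The returned list corresponds to what is presented at 750; an empty list indicates that no metadata is available for display.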

FIG. 8 conceptually illustrates a computer system 800 that can be used to implement some embodiments of the invention. Computer system 800 includes a bus 805, a processor 810, a system memory 815, a read-only memory 820, a permanent storage device 825, input devices 830, and output devices 835. The bus 805 collectively represents all system, peripheral, and chipset buses that communicatively connect the numerous internal devices of the computer system 800. For instance, the bus 805 communicatively connects the processor 810 with the read-only memory 820, the system memory 815, and the permanent storage device 825.

From these various memory units, the processor 810 retrieves instructions to execute and data to process in order to execute the processes of the invention. The read-only memory (ROM) 820 stores static data and instructions that are needed by the processor 810 and other modules of the computer system.

The permanent storage device 825, on the other hand, is a read-and-write memory device. This device is a non-volatile memory unit that stores instructions and data even when the computer system 800 is off. Some embodiments of the invention use a mass-storage device (such as a magnetic or optical disk and its corresponding disk drive) as the permanent storage device 825.

Other embodiments use a removable storage device (such as a floppy disk or zip® disk, and its corresponding disk drive) as the permanent storage device. Like the permanent storage device 825, the system memory 815 is a read-and-write memory device. However, unlike storage device 825, the system memory is a volatile read-and-write memory, such as a random access memory. The system memory stores some of the instructions and data that the processor needs at runtime. In some embodiments, the invention's processes are stored in the system memory 815, the permanent storage device 825, and/or the read-only memory 820.

The bus 805 also connects to the input and output devices 830 and 835. The input devices enable the user to communicate information and select commands to the computer system. The input devices 830 include alphanumeric keyboards and cursor-controllers. The output devices 835 display images generated by the computer system. For instance, these devices display the digital video presentation and its associated metadata. The output devices include printers and display devices, such as cathode ray tubes (CRT) or liquid crystal displays (LCD).

Finally, as shown in FIG. 8, bus 805 also couples computer 800 to a network 865 through a network adapter (not shown). In this manner, the computer can be a part of a network of computers (such as a local area network (“LAN”), a wide area network (“WAN”), or an Intranet) or a network of networks (such as the Internet). Any or all of the components of computer system 800 may be used in conjunction with the invention. However, one of ordinary skill in the art would appreciate that any other system configuration may also be used in conjunction with the present invention.

The invention described above provides a coherent, innovative approach for propagating metadata across different stages of the editing pipeline of a video editing application. This approach allows the editing application to maintain the metadata associated with a digital video presentation, and subsequently use this metadata. By allowing different rules for propagating different types of metadata, the invention provides a robust, flexible mechanism that allows an editor to specify different ways of propagating different metadata across different stages of the editing pipeline.

While the invention has been described with reference to numerous specific details, one of ordinary skill in the art will recognize that the invention can be embodied in other specific forms without departing from the spirit of the invention. Thus, one of ordinary skill in the art would understand that the invention is not to be limited by the foregoing illustrative details, but rather is to be defined by the appended claims. 

1-52. (canceled)
 53. A method for displaying a composite video presentation, the method comprising: retrieving a composite video presentation that comprises a video portion and a set of metadata associated with the video portion, the set of metadata comprising an associated set of rules for displaying the metadata; and displaying the set of metadata along with the video portion on a display device when the set of rules specifies that the metadata should be displayed, wherein the set of metadata is not displayed with the video portion when the set of rules specifies that the set of metadata should not be displayed.
 54. The method of claim 53, wherein the composite video presentation comprises a second set of metadata that comprises an associated second set of rules for displaying the metadata, the method further comprising displaying the second set of metadata along with the video portion on the display device when the second set of rules specifies that the second set of metadata should be displayed.
 55. The method of claim 54, wherein the second set of metadata is not displayed with the video portion when the second set of rules specifies that the second set of metadata should not be displayed.
 56. The method of claim 53, wherein the method is performed by a video-editing application executing on a computer.
 57. The method of claim 56 further comprising editing a plurality of different video items to create the composite video presentation.
 58. The method of claim 53, wherein the set of metadata is associated with a particular duration of the video portion.
 59. The method of claim 58, wherein the composite video presentation comprises a plurality of different sets of metadata associated with different durations of the video portion.
 60. The method of claim 53, wherein displaying the set of metadata comprises: adding the set of metadata to a display list that specifies all metadata to display with the composite video presentation; and displaying the metadata in the display list.
 61. A non-transitory computer readable medium storing a computer program which when executed by at least one processing unit processes digital video for display, the computer program comprising sets of instructions for: retrieving a digital video presentation comprising a set of metadata associated with a portion of the video presentation; determining whether the set of metadata comprises an associated set of rules for displaying the metadata; when the metadata comprises an associated set of rules, determining whether to display the metadata according to the associated set of rules; and when the metadata does not include an associated set of rules, displaying the metadata along with the portion of the video presentation.
 62. The non-transitory computer readable medium of claim 61, wherein the computer program further comprises sets of instructions for, when the metadata comprises an associated set of rules: displaying the metadata along with the portion of the video presentation when the associated set of rules specifies to display the set of metadata; and displaying the video presentation without the metadata when the associated set of rules specifies not to display the set of metadata.
 63. The non-transitory computer readable medium of claim 61, wherein the set of metadata comprises information about the filming of video items used to create the video presentation.
 64. The non-transitory computer readable medium of claim 61, wherein the set of metadata is stored in a metadata track of the digital video presentation and has a start time and an end time in the presentation.
 65. The non-transitory computer readable medium of claim 61, wherein the set of metadata is stored in a user data element that includes a metadata container and an index that associates the set of metadata with a particular video item in the digital video presentation.
 66. The non-transitory computer readable medium of claim 65, wherein the set of metadata has a duration of the particular video item.
 67. The non-transitory computer readable medium of claim 61, wherein the set of rules is organized in an XML document.
 68. A non-transitory computer readable medium storing a computer program which when executed by at least one processing unit displays metadata associated with digital video, the computer program comprising sets of instructions for: displaying a portion of the digital video presentation, the portion having an associated set of metadata that comprises a set of display rules; identifying, based on the set of display rules for the set of metadata, a subset of the associated set of metadata to display with the digital video presentation; and while displaying the portion of the digital video presentation, displaying the identified subset of the metadata set.
 69. The non-transitory computer readable medium of claim 68, wherein the set of instructions for displaying the portion of the digital video presentation comprises a set of instructions for rendering the portion for display.
 70. The non-transitory computer readable medium of claim 68, wherein the digital video presentation comprises an audio portion.
 71. The non-transitory computer readable medium of claim 70, wherein the computer program further comprises a set of instructions for rendering the audio portion for output along with the video portion.
 72. The non-transitory computer readable medium of claim 68, wherein a second set of metadata is associated with a second portion of the digital video presentation, the second set of metadata comprising a second set of display rules, wherein the computer program further comprises sets of instructions for: displaying the second portion of the digital video presentation; identifying, based on the second set of display rules, a subset of the second set of metadata to display with the digital video presentation; and while displaying the second portion of the digital video presentation, displaying the identified subset of the second metadata set.
 73. The non-transitory computer readable medium of claim 72, wherein the first portion comprises a first video item and the second portion comprises a second video item. 